
feat: Implement queue-based async DB export pipeline#188

Open
renganathc wants to merge 24 commits into
geturbackend:mainfrom
renganathc:feature/db-export-pipeline

Conversation

@renganathc
Contributor

@renganathc renganathc commented May 21, 2026

🚀 Pull Request Description

Fixes #176

Implemented asynchronous database exports using BullMQ workers and added a dedicated consumer service for processing export jobs and sending download emails.

One deviation from the initial plan: exports are first written to a temporary local file and then uploaded to storage, rather than streamed directly to cloud storage, because Supabase Storage does not properly support the intended streaming approach. Otherwise, the queue-based export flow follows the planned design.
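A minimal sketch of the temp-file step, assuming the export is written as a single JSON array. `fakeCursor` and `writeExportToTempFile` are hypothetical names used only for illustration (the real worker reads from a MongoDB cursor), and the file layout is an assumption rather than what the PR necessarily produces. The point it shows is honoring the return value of `write()` and waiting for `'drain'` so memory use stays bounded.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { once } = require('node:events');

// Hypothetical stand-in for a MongoDB cursor: any async iterable of documents.
async function* fakeCursor() {
  for (let i = 0; i < 3; i += 1) yield { _id: i, name: `doc-${i}` };
}

// Write documents as one JSON array to filePath, pausing whenever the
// stream's internal buffer is full (write() returned false).
async function writeExportToTempFile(cursor, filePath) {
  const out = fs.createWriteStream(filePath);
  const write = async (chunk) => {
    if (!out.write(chunk)) await once(out, 'drain');
  };

  await write('[');
  let first = true;
  for await (const doc of cursor) {
    await write((first ? '' : ',') + JSON.stringify(doc));
    first = false;
  }
  await write(']');

  out.end();
  await once(out, 'finish'); // resolves once all data is flushed to disk
  return filePath;
}

const tempFilePath = path.join(os.tmpdir(), `export-sketch-${process.pid}.json`);
const exportDone = writeExportToTempFile(fakeCursor(), tempFilePath);
```

Once `exportDone` resolves, the file can be handed to the upload step. Whether the storage client accepts a read stream instead of a buffer should be checked against its documentation.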

🛠️ Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update
  • 🎨 UI/UX improvement (Frontend only)
  • ⚙️ Refactor / Chore

🧪 Testing & Validation

Backend Verification:

  • I have run npm test in the backend/ directory and all tests passed.
  • I have verified the API endpoints using Postman/Thunder Client.
  • New unit tests have been added (if applicable).

Frontend Verification:

  • I have run npm run lint in the frontend/ directory.
  • Verified the UI changes on different screen sizes (Responsive).
  • Checked for any console errors in the browser dev tools.

📸 Screenshots / Recordings

Screenshot 2026-05-21 at 10 59 19 AM Screenshot 2026-05-21 at 11 02 53 AM

✅ Checklist

  • My code follows the code style of this project.
  • I have performed a self-review of my code.
  • I have commented my code, particularly in hard-to-understand areas.
  • My changes generate no new warnings or errors.
  • I have updated the documentation (README/Docs) accordingly.

Built with ❤️ for urBackend.

Summary by CodeRabbit

  • New Features

    • Database export: requestable via new protected API endpoint with per-day limits (1 for standard, 5 for pro)
    • Exports are processed in background, uploaded to S3-compatible storage, and trigger an email with a 24‑hour download link
  • Chores

    • Added a consumer app to run export workers and deployment config
    • Integrated export queue and email flow; updated dev startup to include the consumer workspace

Review Change Stack

@coderabbitai
Contributor

coderabbitai Bot commented May 21, 2026

Warning

Review limit reached

@renganathc, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 22 minutes and 44 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb43a560-bb4f-4840-9062-f03b8ba2f4dc

📥 Commits

Reviewing files that changed from the base of the PR and between 0e71cd0 and a18a81b.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (14)
  • apps/consumer/Dockerfile
  • apps/consumer/package.json
  • apps/consumer/src/index.js
  • apps/consumer/src/workers/export.worker.js
  • apps/dashboard-api/package.json
  • apps/dashboard-api/src/app.js
  • apps/dashboard-api/src/controllers/dbExport.controller.js
  • apps/dashboard-api/src/routes/projects.js
  • package.json
  • packages/common/src/index.js
  • packages/common/src/queues/emailQueue.js
  • packages/common/src/queues/exportQueue.js
  • packages/common/src/utils/emailService.js
  • packages/common/src/utils/storage.manager.js
📝 Walkthrough

Walkthrough

Adds an asynchronous database export feature: a Dashboard API endpoint with per‑day limits, shared BullMQ export queue and email plumbing, a consumer application with a streaming worker that uploads exports to S3‑compatible storage, and supporting storage helper and integration changes.

Changes

Database Export Pipeline

Layer / File(s) Summary
Shared Export Queue & Email Infrastructure
packages/common/src/queues/exportQueue.js, packages/common/src/index.js, packages/common/src/utils/storage.manager.js, packages/common/src/utils/emailService.js, packages/common/src/queues/emailQueue.js
Creates exportQueue, exports storage helper, adds sendExportReadyEmail, and updates email worker to handle send-export-email.
Dashboard API Export Endpoint & Rate Limiting
apps/dashboard-api/package.json, apps/dashboard-api/src/controllers/dbExport.controller.js, apps/dashboard-api/src/routes/projects.js, apps/dashboard-api/src/app.js
Adds POST /api/projects/:projectId/export with owner verification, developer lookup, Redis daily rate limiting (5 for pro, 1 default), job enqueueing, and 202 response with remaining usage; adds bull-board deps.
Consumer App Setup & Startup
apps/consumer/Dockerfile, apps/consumer/package.json, apps/consumer/src/index.js
Adds consumer Dockerfile and package.json; startup loads env, conditionally validates, initializes export worker, and registers graceful shutdown handlers.
Export Worker: Streaming & Storage Upload
apps/consumer/src/workers/export.worker.js
Implements BullMQ worker (concurrency 2) that streams MongoDB collections via cursor into a PassThrough piped to AWS SDK Upload to an S3-compatible bucket, generates 24h signed URL, enqueues export email, and logs job events.
Storage Helpers & Development Integration
packages/common/src/utils/storage.manager.js, packages/common/src/index.js, package.json
Adds getS3CompatibleStorage(project) with provider validation and error logging for unknown provider, and updates root scripts.dev and @aws-sdk/lib-storage version.

Sequence Diagram

sequenceDiagram
  participant Client
  participant DashboardAPI as Dashboard API
  participant ExportQueue as Export Queue (BullMQ)
  participant ConsumerWorker as Consumer Worker
  participant MongoDB
  participant Storage as S3-Compatible Storage
  participant EmailQueue as Email Queue
  Client->>DashboardAPI: POST /projects/:projectId/export
  DashboardAPI->>DashboardAPI: verify owner + check daily limit
  alt Limit exceeded
    DashboardAPI->>Client: 429 Too Many Requests
  else OK
    DashboardAPI->>ExportQueue: enqueue export job
    DashboardAPI->>Client: 202 Accepted + remaining usage
    ExportQueue->>ConsumerWorker: dispatch job
    ConsumerWorker->>MongoDB: cursor() on each collection
    loop per collection
      MongoDB-->>ConsumerWorker: stream documents
      ConsumerWorker->>ConsumerWorker: write JSON to PassThrough
    end
    ConsumerWorker->>Storage: Upload (multipart) via Upload(stream)
    Storage-->>ConsumerWorker: file location
    ConsumerWorker->>Storage: generate presigned URL (86400s)
    ConsumerWorker->>EmailQueue: enqueue send-export-email (downloadUrl, projectName)
    EmailQueue->>Client: sendExportReadyEmail with link
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

feature, backend

Suggested reviewers

  • yash-pouranik

Poem

🐰 In tunnels of code I quietly hop,

Streams and queues in ordered hop,
Exports flow out, a tidy trail,
Links zip off on a 24‑hour sail,
A carrot‑byte for every dev's map!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main change: implementing a queue-based async database export pipeline using BullMQ, which is the primary objective of the PR.
Linked Issues check ✅ Passed All core requirements from issue #176 are met: async BullMQ queue for export jobs (#176), POST endpoint at /:projectId/export with owner validation (#176), streaming with Mongoose cursors (#176), S3-compatible storage upload (#176), and export-ready email notification (#176).
Out of Scope Changes check ✅ Passed All changes are directly related to implementing the async database export pipeline. No extraneous modifications detected; all file additions and updates support the core feature objectives.


Comment thread apps/dashboard-api/src/controllers/dbExport.controller.js Fixed
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

🧹 Nitpick comments (1)
packages/common/src/utils/emailService.js (1)

316-321: ⚡ Quick win

Use configured sender/reply-to values for export emails.

Hardcoding these fields bypasses environment configuration and can cause inconsistent deliverability across environments.

Proposed change
-        const { data, error } = await resend.emails.send({
-            from: '"urBackend" <onboarding@resend.dev>',
+        const { data, error } = await resend.emails.send({
+            from: defaultFromAddress,
             to: to,
             subject: subject,
             text: textBody,
-            replyTo: 'urbackend@apps.bitbros.in',
+            replyTo: replyToAddress,
         });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/common/src/utils/emailService.js` around lines 316 - 321, The export
email send call currently hardcodes from and replyTo values ("urBackend"
<onboarding@resend.dev> and 'urbackend@apps.bitbros.in'); update the send
routine (the function that builds and calls the email client—look for the
send/export email function in emailService.js where the object with from, to,
subject, text, replyTo is passed) to use the configured sender and reply-to
values instead (read from the existing configuration/environment variables or
the module's mail config constants rather than literal strings) so all
environments honor configuration for deliverability.
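For the sender/reply-to nitpick above, one way the configured values might be resolved is sketched here. `EMAIL_FROM`, `EMAIL_REPLY_TO`, and `resolveSenderConfig` are hypothetical names, not taken from this repository; the fallbacks are the literals currently hardcoded in the diff. If the module already defines `defaultFromAddress` and `replyToAddress`, reusing those is the simpler fix.

```javascript
// Hypothetical helper: read sender settings from the environment and fall
// back to the values that are hardcoded in the export email today.
function resolveSenderConfig(env = process.env) {
  return {
    from: env.EMAIL_FROM || '"urBackend" <onboarding@resend.dev>',
    replyTo: env.EMAIL_REPLY_TO || 'urbackend@apps.bitbros.in',
  };
}

// Example: an environment that overrides only the sender address.
const senderSketch = resolveSenderConfig({ EMAIL_FROM: 'exports@example.com' });
```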
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/consumer/Dockerfile`:
- Around line 1-20: The Dockerfile is creating and running the container as
root; modify it to create a non-root user, chown the app directory to that user,
and switch to that user before CMD so the container runs unprivileged: add a
user creation step (e.g., addgroup/adduser or useradd) after the workspace and
package copy steps, run chown -R on /app (the WORKDIR used by RUN npm ci and
subsequent COPYs) to give ownership to the new user, and add a USER instruction
near the end so CMD ["npm","run","start"] runs as that non-root user.

In `@apps/consumer/package.json`:
- Around line 1-15: Add a "dependencies" section to apps/consumer/package.json
declaring the runtime packages the consumer imports: add "dotenv": "^17.2.3",
"bullmq": "^5.70.1", and "`@urbackend/common`": "*" (to match the monorepo
protocol/versions used elsewhere), then run the workspace install to update the
lockfile; update any existing script or import usage if package names differ.
Ensure the "dependencies" key is present alongside "scripts" and uses the exact
versions listed so module resolution works when the consumer workspace is
installed/run in isolation.

In `@apps/consumer/src/workers/export.worker.js`:
- Around line 44-47: The export loop currently ignores writeStream.write(...)
return values in the for await (const doc of cursor) loop (using the local
variable first) which can cause memory bloat; change it to check the boolean
result and, when false, await a Promise that resolves on
writeStream.once('drain') before continuing to apply backpressure. Also stop
calling fs.readFileSync(tempFilePath) before upload; instead stream the temp
file to Supabase (e.g., use fs.createReadStream(tempFilePath) or the Supabase
streaming/file API) so upload uses a stream rather than loading the whole file
into RAM (replace the fs.readFileSync usage and the subsequent
supabase.storage...upload(...) call to accept a stream).
- Around line 69-72: The code currently reads the entire temp file into memory
with fs.readFileSync(tempFilePath) before calling
supabase.storage.from(bucket).upload(storagePath, ...); replace the synchronous
read with a streaming upload by creating a ReadableStream
(fs.createReadStream(tempFilePath)) and passing that stream to
supabase.storage.from(bucket).upload so the file is uploaded without blocking
the event loop or loading the entire file into memory; keep the same
storagePath, bucket and contentType/upsert options and ensure any stream errors
are handled/rejected before completing the worker task (consider switching to
resumable/TUS uploads if exports may be very large).

In `@apps/dashboard-api/src/controllers/dbExport.controller.js`:
- Around line 40-50: The current get + incr sequence is race-prone; replace it
with a single atomic Redis operation (e.g. an EVAL Lua script or a Redis
multi/transaction) that increments the key, sets the 24h expiry when the counter
becomes 1, and returns the new count in one call; then check the returned count
against maxExports and call next(new AppError(...)) if it exceeds the limit
before calling exportQueue.add. Target the existing symbols key,
redis.get/redis.incr, maxExports, and exportQueue.add and ensure the atomic
script does: INCR key, if value == 1 then EXPIRE key 86400, return value.
- Around line 52-54: The response currently returns a message-only payload;
update the res.status(202).json call in the db export controller to follow the
standardized shape { success, data, message } — e.g. return
res.status(202).json({ success: true, data: { usageToday:
`${newCount}/${maxExports}`, newCount, maxExports }, message: "Database export
request received. You will receive an email with a download link shortly." });
ensure you modify the existing return in the export handler (the
res.status(202).json(...) statement) so callers receive success, data, and
message fields.
- Around line 57-58: The controller currently logs the full error but constructs
AppError using err.message which can leak internal/MongoDB details; keep the
console.error for internal diagnostics (including err) but change the AppError
instantiation to return a generic client-safe message (e.g., "Failed to initiate
database export.") instead of err.message, optionally attach an internal error
id or err.code to logs only; update the code around console.error and the new
AppError(...) call so next(new AppError(500, ...)) never includes err.message
while still logging err for debugging (referencing AppError, console.error,
req.params.projectId, next in the db export handler).

In `@packages/common/src/queues/emailQueue.js`:
- Around line 11-30: The worker currently treats unknown job.name values as
silent no-ops; update the handler so that after the known branches
(release-email and send-export-email) it explicitly rejects unsupported job
types by throwing an Error (or returning a failed Promise) that includes the
unrecognized job.name and job.data for observability; modify the function that
processes jobs (the code handling job.name in
packages/common/src/queues/emailQueue.js) to add a final else/throw path that
logs and throws a descriptive error like "Unsupported email job type:
<job.name>" so unknown jobs fail loudly.

In `@packages/common/src/utils/storage.manager.js`:
- Around line 145-147: The unknown-provider branch in getStorage
(storage.manager.js) logs the provider after throwing an Error, making
console.error unreachable; move or duplicate the diagnostic call so the provider
is logged before the exception is thrown (i.e., call console.error("[getStorage]
Unknown storage provider: ", provider) prior to throw new Error("Unknown storage
provider: " + provider)) so the provider is recorded when getStorage encounters
an unsupported provider.

---

Nitpick comments:
In `@packages/common/src/utils/emailService.js`:
- Around line 316-321: The export email send call currently hardcodes from and
replyTo values ("urBackend" <onboarding@resend.dev> and
'urbackend@apps.bitbros.in'); update the send routine (the function that builds
and calls the email client—look for the send/export email function in
emailService.js where the object with from, to, subject, text, replyTo is
passed) to use the configured sender and reply-to values instead (read from the
existing configuration/environment variables or the module's mail config
constants rather than literal strings) so all environments honor configuration
for deliverability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5e4b2562-ec50-430a-9cee-645134229570

📥 Commits

Reviewing files that changed from the base of the PR and between 8b81ff4 and 4b50866.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (14)
  • apps/consumer/Dockerfile
  • apps/consumer/package.json
  • apps/consumer/src/index.js
  • apps/consumer/src/workers/export.worker.js
  • apps/dashboard-api/package.json
  • apps/dashboard-api/src/app.js
  • apps/dashboard-api/src/controllers/dbExport.controller.js
  • apps/dashboard-api/src/routes/projects.js
  • package.json
  • packages/common/src/index.js
  • packages/common/src/queues/emailQueue.js
  • packages/common/src/queues/exportQueue.js
  • packages/common/src/utils/emailService.js
  • packages/common/src/utils/storage.manager.js
💤 Files with no reviewable changes (1)
  • apps/dashboard-api/src/app.js

Comment thread apps/consumer/Dockerfile
Comment on lines +1 to +20
FROM node:22-alpine

WORKDIR /app

# Copy root package files
COPY package.json package-lock.json ./

# Copy workspace package.json files
COPY packages/common/package.json ./packages/common/
COPY apps/consumer/package.json ./apps/consumer/

RUN npm ci

# Copy actual source code
COPY packages/common ./packages/common
COPY apps/consumer ./apps/consumer

WORKDIR /app/apps/consumer

CMD ["npm", "run", "start"]
Contributor


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Run the consumer container as a non-root user.

The image currently runs as root, which is an avoidable security risk.

Proposed change
 FROM node:22-alpine
 
 WORKDIR /app
@@
 COPY packages/common ./packages/common
 COPY apps/consumer ./apps/consumer
 
 WORKDIR /app/apps/consumer
+RUN chown -R node:node /app
+USER node
 
 CMD ["npm", "run", "start"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
FROM node:22-alpine
WORKDIR /app
# Copy root package files
COPY package.json package-lock.json ./
# Copy workspace package.json files
COPY packages/common/package.json ./packages/common/
COPY apps/consumer/package.json ./apps/consumer/
RUN npm ci
# Copy actual source code
COPY packages/common ./packages/common
COPY apps/consumer ./apps/consumer
WORKDIR /app/apps/consumer
CMD ["npm", "run", "start"]
FROM node:22-alpine
WORKDIR /app
# Copy root package files
COPY package.json package-lock.json ./
# Copy workspace package.json files
COPY packages/common/package.json ./packages/common/
COPY apps/consumer/package.json ./apps/consumer/
RUN npm ci
# Copy actual source code
COPY packages/common ./packages/common
COPY apps/consumer ./apps/consumer
WORKDIR /app/apps/consumer
RUN chown -R node:node /app
USER node
CMD ["npm", "run", "start"]
🧰 Tools
🪛 Trivy (0.69.3)

[error] 1-1: Image user should not be 'root'

Specify at least 1 USER command in Dockerfile with non-root user as argument

Rule: DS-0002

Learn more

(IaC/Dockerfile)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/consumer/Dockerfile` around lines 1 - 20, The Dockerfile is creating and
running the container as root; modify it to create a non-root user, chown the
app directory to that user, and switch to that user before CMD so the container
runs unprivileged: add a user creation step (e.g., addgroup/adduser or useradd)
after the workspace and package copy steps, run chown -R on /app (the WORKDIR
used by RUN npm ci and subsequent COPYs) to give ownership to the new user, and
add a USER instruction near the end so CMD ["npm","run","start"] runs as that
non-root user.

Comment on lines +1 to +15
{
  "name": "consumer",
  "version": "1.0.0",
  "description": "",
  "main": "src/app.js",
  "scripts": {
    "dev": "node src/index.js",
    "start": "node src/index.js",
    "test": "jest --testPathPatterns=src/"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "type": "commonjs"
}
Contributor


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show consumer manifest
echo "---- apps/consumer/package.json ----"
cat -n apps/consumer/package.json

# Identify import/require usage in consumer src
echo "---- imports in apps/consumer/src ----"
rg -n --no-heading "(require\\(|from\\s+['\"]|import\\s+['\"])\\s*('dotenv'|\"dotenv\"|dotenv)|\\bbullmq\\b|@urbackend/common" apps/consumer/src || true

# Show consumer src key files referenced
echo "---- list apps/consumer/src ----"
find apps/consumer/src -maxdepth 2 -type f -print

# Inspect monorepo root package.json for workspace/hoisting rules
echo "---- root package.json ----"
ls -1
if [ -f package.json ]; then cat -n package.json; fi

# If a package manager config exists (pnpm/yarn/npm), show workspace relevant sections
for f in pnpm-workspace.yaml yarn.lock package-lock.json pnpm-lock.yaml .npmrc .yarnrc.yml; do
  if [ -e "$f" ]; then
    echo "---- $f ----"
    head -n 80 "$f" || true
  fi
done

Repository: geturbackend/urBackend

Length of output: 4313



Declare consumer runtime dependencies in apps/consumer/package.json.

apps/consumer imports dotenv, @urbackend/common, and bullmq, but its package.json declares none (and package-lock.json shows no dependencies recorded for apps/consumer). This can break module resolution when installing/running the consumer workspace in isolation. Match the versions/protocol used by the other workspaces in this monorepo (e.g., dotenv ^17.2.3, bullmq ^5.70.1, @urbackend/common *).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/consumer/package.json` around lines 1 - 15, Add a "dependencies" section
to apps/consumer/package.json declaring the runtime packages the consumer
imports: add "dotenv": "^17.2.3", "bullmq": "^5.70.1", and "`@urbackend/common`":
"*" (to match the monorepo protocol/versions used elsewhere), then run the
workspace install to update the lockfile; update any existing script or import
usage if package names differ. Ensure the "dependencies" key is present
alongside "scripts" and uses the exact versions listed so module resolution
works when the consumer workspace is installed/run in isolation.

Comment thread apps/consumer/src/workers/export.worker.js
Comment thread apps/consumer/src/workers/export.worker.js Outdated
Comment on lines +40 to +50
const currentCount = await redis.get(key);
if (currentCount && Number(currentCount) >= maxExports) {
    return next(new AppError(429, `Daily export limit reached (${maxExports}/${maxExports}). Please try again tomorrow.`));
}

const newCount = await redis.incr(key);
if (newCount === 1) {
    await redis.expire(key, 86400); // Set expiry to 24 hours
}

await exportQueue.add('export-database', { projectId, userId, email });
Contributor


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make rate-limit increment/check atomic.

get + incr is race-prone; parallel requests can exceed the daily limit before either request is rejected.

Proposed change
-        const currentCount = await redis.get(key);
-        if (currentCount && Number(currentCount) >= maxExports) {
-            return next(new AppError(429, `Daily export limit reached (${maxExports}/${maxExports}). Please try again tomorrow.`));
-        }
-
         const newCount = await redis.incr(key);
         if (newCount === 1) {
             await redis.expire(key, 86400); // Set expiry to 24 hours
         }
+        if (newCount > maxExports) {
+            return next(new AppError(429, `Daily export limit reached (${maxExports}/${maxExports}). Please try again tomorrow.`));
+        }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dashboard-api/src/controllers/dbExport.controller.js` around lines 40 -
50, The current get + incr sequence is race-prone; replace it with a single
atomic Redis operation (e.g. an EVAL Lua script or a Redis multi/transaction)
that increments the key, sets the 24h expiry when the counter becomes 1, and
returns the new count in one call; then check the returned count against
maxExports and call next(new AppError(...)) if it exceeds the limit before
calling exportQueue.add. Target the existing symbols key, redis.get/redis.incr,
maxExports, and exportQueue.add and ensure the atomic script does: INCR key, if
value == 1 then EXPIRE key 86400, return value.
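The INCR / EXPIRE / return steps listed in the prompt could be sketched roughly as follows. This assumes the `redis` client exposes `eval(script, numKeys, key, arg)` in the style of ioredis, which is not confirmed by the diff; `consumeDailyExport` and `createFakeRedis` are hypothetical names, and the fake client only imitates the script's counting so the sketch runs without a Redis server.

```javascript
// Lua script: increment, set the 24h expiry on the first hit, return the count.
// Redis runs a script atomically, so parallel requests cannot interleave.
const INCR_WITH_TTL = `
local count = redis.call('INCR', KEYS[1])
if count == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
`;

// Assumption: redis.eval(script, numKeys, ...keys, ...args), as in ioredis.
async function consumeDailyExport(redis, key, maxExports) {
  const count = Number(await redis.eval(INCR_WITH_TTL, 1, key, 86400));
  return { allowed: count <= maxExports, count };
}

// In-memory stand-in for Redis (counting only, no expiry), for illustration.
function createFakeRedis() {
  const counters = new Map();
  return {
    async eval(_script, _numKeys, key) {
      const next = (counters.get(key) || 0) + 1;
      counters.set(key, next);
      return next;
    },
  };
}

// Three requests against a limit of two: the third should be rejected.
const rateLimitDemo = (async () => {
  const redis = createFakeRedis();
  const results = [];
  for (let i = 0; i < 3; i += 1) {
    results.push(await consumeDailyExport(redis, 'export:demo', 2));
  }
  return results;
})();
```

One consequence of incrementing first: rejected requests still bump the counter, so a displayed usage figure should be capped at `maxExports`.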

Comment on lines +52 to +54
return res.status(202).json({
    message: `Database export request received. You will receive an email with a download link shortly. Usage today: ${newCount}/${maxExports}.`,
});
Contributor


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return the standardized API response shape.

This controller should return { success, data, message } instead of a message-only payload.

Proposed change
-        return res.status(202).json({
-            message: `Database export request received. You will receive an email with a download link shortly. Usage today: ${newCount}/${maxExports}.`,
-        });
+        return res.status(202).json({
+            success: true,
+            data: {
+                usage: {
+                    used: newCount,
+                    limit: maxExports,
+                    remaining: Math.max(0, maxExports - newCount),
+                },
+            },
+            message: `Database export request received. You will receive an email with a download link shortly.`,
+        });
As per coding guidelines, `**/src/controllers/**/*.{js,ts}` requires all API endpoints to return `{ success: bool, data: {}, message: "" }`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dashboard-api/src/controllers/dbExport.controller.js` around lines 52 -
54, The response currently returns a message-only payload; update the
res.status(202).json call in the db export controller to follow the standardized
shape { success, data, message } — e.g. return res.status(202).json({ success:
true, data: { usageToday: `${newCount}/${maxExports}`, newCount, maxExports },
message: "Database export request received. You will receive an email with a
download link shortly." }); ensure you modify the existing return in the export
handler (the res.status(202).json(...) statement) so callers receive success,
data, and message fields.
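As a sketch of the `{ success, data, message }` shape requested above, the payload could be built by a small pure function and passed to `res.status(202).json(...)`. `buildExportAcceptedResponse` is a hypothetical helper name; the `usage` fields follow the proposed change in this comment rather than any existing code.

```javascript
// Hypothetical helper: builds the 202 payload in the standardized shape.
function buildExportAcceptedResponse(newCount, maxExports) {
  return {
    success: true,
    data: {
      usage: {
        used: newCount,
        limit: maxExports,
        // Clamp at zero in case the counter ever exceeds the limit.
        remaining: Math.max(0, maxExports - newCount),
      },
    },
    message: 'Database export request received. You will receive an email with a download link shortly.',
  };
}

// Example: second export of the day on a plan that allows five.
const acceptedSketch = buildExportAcceptedResponse(2, 5);
```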

Comment on lines +57 to +58
console.error(`[Dashboard API] Error handling export request for project ${req.params.projectId}:`, err);
return next(new AppError(500, err.message || "Failed to initiate database export."));
Contributor


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not expose raw internal error messages to clients.

Propagating err.message can leak MongoDB/internal details in API responses.

Proposed change
-        return next(new AppError(500, err.message || "Failed to initiate database export."));
+        return next(new AppError(500, "Failed to initiate database export."));
As per coding guidelines, `**/src/controllers/**/*.{js,ts}` says: `Use AppError class for errors — never raw throw, never expose MongoDB errors to client.`
🧰 Tools
🪛 GitHub Check: CodeQL

[failure] 57-57: Use of externally-controlled format string
Format string depends on a user-provided value.
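
The CodeQL finding is separate from the `err.message` leak: `console.error` passes its arguments through `util.format`, so when extra arguments follow, a `projectId` containing `%s` or `%d` inside the first argument is treated as a format specifier. A minimal sketch of one way to address it, assuming a hypothetical `logExportError` helper that is not in the PR:

```javascript
// Constant format string: the user-supplied projectId is only ever passed as
// a value, never as part of the format itself.
const EXPORT_ERROR_FORMAT =
  "[Dashboard API] Error handling export request for project %s:";

// Hypothetical helper (not in the PR).
function logExportError(projectId, err) {
  console.error(EXPORT_ERROR_FORMAT, projectId, err);
}

// Possible use in the catch block:
//   logExportError(req.params.projectId, err);
```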

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dashboard-api/src/controllers/dbExport.controller.js` around lines 57 -
58, The controller currently logs the full error but constructs AppError using
err.message which can leak internal/MongoDB details; keep the console.error for
internal diagnostics (including err) but change the AppError instantiation to
return a generic client-safe message (e.g., "Failed to initiate database
export.") instead of err.message, optionally attach an internal error id or
err.code to logs only; update the code around console.error and the new
AppError(...) call so next(new AppError(500, ...)) never includes err.message
while still logging err for debugging (referencing AppError, console.error,
req.params.projectId, next in the db export handler).

Comment on lines +11 to +30
if (job.name === 'release-email') {
const { email, version, title, content, changelogUrl } = job.data;
try {
console.log(`[Queue] Processing Release email for: ${email}`);
await sendReleaseEmail(email, { version, title, content, changelogUrl });
} catch (error) {
console.error(`[Queue] Failed to send email to ${email}:`, error);
throw error;
}
}

if (job.name === 'send-export-email') {
const { email, downloadUrl, projectName } = job.data;

console.log(`[EmailWorker] Sending simple export email to ${email} for ${projectName}`);

await sendExportReadyEmail({ to:email, downloadUrl, projectName});

console.log(`[EmailWorker] Export email successfully sent to ${email}`);
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail unsupported email job types explicitly.

Right now, unknown job.name values are acknowledged as successful no-ops, which can silently drop emails.

Proposed change
-        if (job.name === 'release-email') {
+        if (job.name === 'release-email') {
             const { email, version, title, content, changelogUrl } = job.data;
             try {
                 console.log(`[Queue] Processing Release email for: ${email}`);
                 await sendReleaseEmail(email, { version, title, content, changelogUrl });
             } catch (error) {
                 console.error(`[Queue] Failed to send email to ${email}:`, error);
                 throw error;
             }
-        }
-
-        if (job.name === 'send-export-email') {
+        } else if (job.name === 'send-export-email') {
             const { email, downloadUrl, projectName } = job.data;
 
             console.log(`[EmailWorker] Sending simple export email to ${email} for ${projectName}`);
 
             await sendExportReadyEmail({ to:email, downloadUrl, projectName});
             
             console.log(`[EmailWorker] Export email successfully sent to ${email}`);
+        } else {
+            throw new Error(`Unsupported email job type: ${job.name}`);
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/common/src/queues/emailQueue.js` around lines 11 - 30, The worker
currently treats unknown job.name values as silent no-ops; update the handler so
that after the known branches (release-email and send-export-email) it
explicitly rejects unsupported job types by throwing an Error (or returning a
failed Promise) that includes the unrecognized job.name and job.data for
observability; modify the function that processes jobs (the code handling
job.name in packages/common/src/queues/emailQueue.js) to add a final else/throw
path that logs and throws a descriptive error like "Unsupported email job type:
<job.name>" so unknown jobs fail loudly.

Comment on lines 145 to 147
throw new Error("Unknown storage provider: " + provider);
console.error("[getStorage] Unknown storage provider: ", provider);
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

ls -la
sed -n '120,190p' packages/common/src/utils/storage.manager.js

Repository: geturbackend/urBackend

Length of output: 4513


Fix unreachable console.error in unknown storage provider path

In packages/common/src/utils/storage.manager.js, the throw new Error(...) happens before console.error(...), so the provider diagnostic log never runs.

Suggested fix
             } else {
-                throw new Error("Unknown storage provider: " + provider);
                 console.error("[getStorage] Unknown storage provider: ", provider);
+                throw new Error("Unknown storage provider: " + provider);
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/common/src/utils/storage.manager.js` around lines 145 - 147, The
unknown-provider branch in getStorage (storage.manager.js) logs the provider
after throwing an Error, making console.error unreachable; move or duplicate the
diagnostic call so the provider is logged before the exception is thrown (i.e.,
call console.error("[getStorage] Unknown storage provider: ", provider) prior to
throw new Error("Unknown storage provider: " + provider)) so the provider is
recorded when getStorage encounters an unsupported provider.

@renganathc
Contributor Author

renganathc commented May 21, 2026

Ahh, these are quick fixes. Also, @yash-pouranik, I wanted to discuss how to proceed with the storage bucket streaming, as Supabase doesn't do a good job with streaming. From my homework, AWS S3 has great streaming capabilities, however. Should I check the storage provider and pick an upload pipeline accordingly (write locally then upload / direct stream)?

In the current setup, I optimized it by streaming to a temp file on disk, so the export is never loaded into memory in full.

@yash-pouranik
Collaborator

yash-pouranik commented May 22, 2026

Really sorry for the long delay. Will join at around 4 or 5 pm.
@renganathc

@renganathc
Contributor Author

Sure, no worries :)

@renganathc
Contributor Author

Hi @yash-pouranik. Could we have a quick chat on how to proceed further? There are a few things we should get sorted out.

@yash-pouranik
Collaborator

Really sorry, I literally forgot about this.
Just reading everything.
Just 5 mins.

@yash-pouranik
Collaborator

So you mean the current approach is creating a file on disk and then streaming all the data into that file?
Look, right now we don't have a dedicated VPS; we are just running on the free Render tier. We have $50 of credit from Render as a token to work on this project, but we will use it only after a proper product is ready.
So for now, in urBackend, we have to handle streaming for 3 types of storage:

  • R2
  • S3
  • Supabase, which is S3-compatible

So for projects that have S3 or R2 configured, we provide the same streaming capabilities we discussed in the issue: create the file in cloud storage and stream directly into it.
And for Supabase, there is only one option as far as I can see from my searches:

  • Supabase cloud is S3-compatible, so we can use the AWS client SDK to upload files to Supabase just by adding the Supabase cloud endpoint and other credentials

What say?
@renganathc
@coderabbitai

@coderabbitai
Contributor

coderabbitai Bot commented May 23, 2026

@yash-pouranik Great analysis! Here's a detailed take on each point:


✅ Yes, current approach = stream to disk first, then upload

That's exactly what export.worker.js does — it streams the Mongoose cursor into a temp JSON file on disk (memory-safe), then reads that file and uploads it to Supabase. The disk write is fine for memory, but on Render's free tier the ephemeral disk is very limited (512 MB), so for large exports this could be a problem.


🎯 Recommended: One unified AWS S3-compatible streaming pipeline for all 3 providers

Here's the good news — you don't need separate pipelines. All three providers (S3, R2, Supabase) speak the S3 protocol and can use @aws-sdk/lib-storage's Upload helper, which accepts a Node.js stream as Body and handles multipart upload automatically under the hood. This means you can stream directly from the Mongoose cursor → S3-compatible multipart upload, skipping the temp file entirely.

Supabase Storage is now officially an S3-compatible storage provider and supports multipart uploads via the S3 protocol. Multipart uploads split the file into smaller parts and upload them in parallel; this allows retrying individual parts on network issues and is the preferable method for server-side uploads when you want to maximize upload speed.

The AWS SDK's managed upload intelligently detects when a stream can be split into multiple parts, provides more robust operations (individual parts can be retried separately), allows parallel uploads, and — most importantly — does not need to know the full size of the stream in advance.


How to configure each provider

const { S3Client } = require("@aws-sdk/client-s3");
const { Upload } = require("@aws-sdk/lib-storage");

function getS3Client(project) {
  const { storageProvider, storageConfig } = project;

  if (storageProvider === "s3") {
    return new S3Client({
      region: storageConfig.region,
      credentials: {
        accessKeyId: storageConfig.accessKeyId,
        secretAccessKey: storageConfig.secretAccessKey,
      },
    });
  }

  if (storageProvider === "r2") {
    return new S3Client({
      region: "auto",
      endpoint: `https://${storageConfig.accountId}.r2.cloudflarestorage.com`,
      credentials: {
        accessKeyId: storageConfig.accessKeyId,
        secretAccessKey: storageConfig.secretAccessKey,
      },
    });
  }

  if (storageProvider === "supabase") {
    // Generate S3 access keys from: Supabase Dashboard → Storage → Settings → S3 Access Keys
    return new S3Client({
      region: storageConfig.region ?? "ap-south-1", // any valid region works
      endpoint: `${storageConfig.projectUrl}/storage/v1/s3`,
      credentials: {
        accessKeyId: storageConfig.s3AccessKeyId,
        secretAccessKey: storageConfig.s3SecretAccessKey,
      },
      forcePathStyle: true,
    });
  }

  throw new Error(`Unsupported storage provider: ${storageProvider}`);
}

Streaming export pipeline (no temp file needed)

const { PassThrough } = require("stream");
const { Upload } = require("@aws-sdk/lib-storage");

async function streamExportToStorage(project, mongooseModel) {
  const s3Client = getS3Client(project);
  const passThrough = new PassThrough();

  const upload = new Upload({
    client: s3Client,
    params: {
      Bucket: project.storageBucket,
      Key: `exports/db_export_${Date.now()}.json`,
      Body: passThrough,
      ContentType: "application/json",
    },
    queueSize: 4,        // concurrent parts
    partSize: 5 * 1024 * 1024, // 5 MB minimum per S3 spec
  });

  // Stream Mongoose cursor into PassThrough
  async function writeDocuments() {
    try {
      const cursor = mongooseModel.find().lean().cursor();
      passThrough.write("[");
      let first = true;
      for await (const doc of cursor) {
        if (!first) passThrough.write(",");
        passThrough.write(JSON.stringify(doc));
        first = false;
      }
      passThrough.write("]");
      passThrough.end();
    } catch (err) {
      // Fail the upload too instead of leaving it waiting on the stream
      passThrough.destroy(err);
      throw err;
    }
  }

  // The upload only starts reading Body once done() is called, so run it
  // concurrently with the writer rather than after the loop has finished
  const [result] = await Promise.all([upload.done(), writeDocuments()]);
  return result.Key;
}

Summary of the approach

| Provider | S3 SDK endpoint config | Streaming support |
| --- | --- | --- |
| AWS S3 | Standard (no endpoint needed) | ✅ Native |
| Cloudflare R2 | `https://<accountId>.r2.cloudflarestorage.com` | ✅ Via S3 protocol |
| Supabase | `<projectUrl>/storage/v1/s3` + S3 keys from Storage Settings | ✅ Via S3 protocol |

To authenticate with Supabase S3, you can use standard access_key and secret_key credentials generated from the Storage Settings page — this method is widely compatible with tools supporting the S3 protocol and is meant to be used exclusively server-side.

This eliminates the temp file entirely, avoids ephemeral disk pressure on Render's free tier, and gives you one unified upload path for all three providers. The only thing to store per-project in config is the S3 credentials variant (Supabase's S3 keys are separate from its API key).

🐰 ✨

@renganathc
Contributor Author

I went through the native Supabase Storage API and found that it isn't streaming-compatible. I'll take a look at Supabase's S3-compatible gateway now. If it provides a way to stream directly, I'll make the changes and update the PR. Sounds good?

@yash-pouranik
Collaborator

Yeah, sure,
and you can just blindly follow CodeRabbit's comment.
And I confirm that each cloud provider supports usage through the AWS client SDK.

@renganathc
Contributor Author

Awesome, I'll make the changes and get back to you soon :)

@yash-pouranik
Collaborator

bro Waiting... @renganathc

@renganathc renganathc force-pushed the feature/db-export-pipeline branch from 4b50866 to 0e71cd0 Compare May 27, 2026 01:37
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
apps/consumer/src/workers/export.worker.js (1)

62-66: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle backpressure when writing to PassThrough stream.

The cursor loop ignores write() return values. For large exports (100k+ documents per issue requirements), the PassThrough buffer can grow unbounded if the S3 Upload consumes slower than the cursor produces.

Proposed fix with backpressure handling
+const { once } = require('events');
 // ... at top of file

 for await (const doc of cursor) {
-    if (!first) passThrough.write(',\n');
-    passThrough.write(`    ${JSON.stringify(doc)}`);
+    if (!first) {
+        if (!passThrough.write(',\n')) await once(passThrough, 'drain');
+    }
+    if (!passThrough.write(`    ${JSON.stringify(doc)}`)) {
+        await once(passThrough, 'drain');
+    }
     first = false;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/consumer/src/workers/export.worker.js` around lines 62 - 66, The cursor
loop writing to the PassThrough stream ignores write() return values and can
overflow the buffer; update the loop that iterates over cursor (the for await
(const doc of cursor) block) to check the boolean result of
passThrough.write(...) and when it returns false await the 'drain' event before
continuing (e.g., await new Promise(resolve => passThrough.once('drain',
resolve))). Ensure this backpressure handling is applied for the initial '[' and
subsequent comma + JSON writes so the export.worker.js passThrough stream never
grows unbounded.
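
A minimal sketch of that backpressure handling, covering the opening and closing brackets as well as the per-document writes. The helper names `writeChunk` and `writeJsonArray` are hypothetical and not part of the PR; `docs` stands in for the Mongoose cursor (any sync or async iterable works).

```javascript
const { once } = require("events");
const { PassThrough } = require("stream");

// Hypothetical helper: write one chunk and, if the stream's internal buffer
// is full (write() returned false), wait for 'drain' before continuing.
async function writeChunk(stream, chunk) {
  if (!stream.write(chunk)) {
    await once(stream, "drain");
  }
}

// Hypothetical helper: serialize an iterable of documents as a JSON array
// into `stream`, respecting backpressure on every write, then end the stream.
async function writeJsonArray(stream, docs) {
  await writeChunk(stream, "[\n");
  let first = true;
  for await (const doc of docs) {
    if (!first) await writeChunk(stream, ",\n");
    await writeChunk(stream, `    ${JSON.stringify(doc)}`);
    first = false;
  }
  await writeChunk(stream, "\n]");
  stream.end();
}
```

Note that something must already be consuming the stream (for example the S3 upload) while `writeJsonArray` runs; otherwise `'drain'` is never emitted and the writer waits indefinitely.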
packages/common/src/utils/storage.manager.js (1)

145-146: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unreachable console.error after throw.

The throw statement on line 145 executes before the console.error on line 146, making the log unreachable. Swap the order.

Proposed fix
             } else {
+                console.error("[getStorage] Unknown storage provider: ", provider);
                 throw new Error("Unknown storage provider: " + provider);
-                console.error("[getStorage] Unknown storage provider: ", provider);
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/common/src/utils/storage.manager.js` around lines 145 - 146, The
console.error after the throw is unreachable; in the getStorage flow log the
unknown provider before raising the exception by moving the
console.error("[getStorage] Unknown storage provider: ", provider) to precede
the throw new Error("Unknown storage provider: " + provider) (i.e., log the
provider inside the getStorage branch then throw), so the runtime will emit the
diagnostic before the exception is thrown.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 26a627e0-1a6b-4792-9e8c-9d60ea97a3d2

📥 Commits

Reviewing files that changed from the base of the PR and between 4b50866 and 0e71cd0.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
  • apps/consumer/src/workers/export.worker.js
  • package.json
  • packages/common/src/index.js
  • packages/common/src/utils/storage.manager.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • package.json

Comment on lines +288 to +296
const s3Client = new S3Client({
region: "auto",
endpoint: process.env.SUPABASE_URL,
forcePathStyle: true,
credentials: {
accessKeyId: process.env.SUPABASE_PUBLIC_KEY,
secretAccessKey: process.env.SUPABASE_SECRET_KEY
}
});

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Supabase S3 compatible API endpoint format storage v1 s3

💡 Result:

Supabase Storage “S3 compatible API endpoint” format (Storage S3 protocol) is: https://<project-ref>.storage.supabase.co/storage/v1/s3. The path portion is exactly “/storage/v1/s3” and is described as the S3 protocol endpoint exposed by Storage [1][2]. Supabase’s S3 authentication docs show the full example endpoint including the “storage.supabase.co” hostname and “/storage/v1/s3” path [3][4]. Note: This is different from the Supabase Storage REST API base URL, which is https://<project-ref>.supabase.co/storage/v1/ and is used for the REST Storage API [5].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the relevant portion of the file around the flagged lines
sed -n '240,330p' packages/common/src/utils/storage.manager.js

# Find any other references to SUPABASE_URL or the s3 endpoint logic
rg -n "SUPABASE_URL|storage/v1/s3|S3Client|getS3CompatibleStorage" packages/common/src/utils/storage.manager.js packages/common/src/utils -S || true

# If there are env var docs elsewhere, try to locate them
rg -n "SUPABASE_PUBLIC_KEY|SUPABASE_SECRET_KEY|SUPABASE_URL" -S . || true

Repository: geturbackend/urBackend

Length of output: 6282


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect top of storage.manager.js to see how SUPABASE_URL is treated
sed -n '1,120p' packages/common/src/utils/storage.manager.js

# Inspect internal S3 gateway block a bit more fully
sed -n '250,430p' packages/common/src/utils/storage.manager.js

# Check where SUPABASE_URL is provided to Supabase clients (apps)
sed -n '1,120p' apps/public-api/src/app.js
sed -n '70,140p' apps/dashboard-api/src/app.js

# Search for any documentation or env validation about SUPABASE_URL
rg -n "SUPABASE_URL" -S . || true
rg -n "SUPABASE_PUBLIC_KEY|SUPABASE_SECRET_KEY" -S . || true

# Look for storage S3 endpoint construction patterns in repo
rg -n "storage/v1/s3|storage\\.supabase\\.co|s3Endpoint" -S . || true

Repository: geturbackend/urBackend

Length of output: 17176


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the env defaults/any constants at the top of storage.manager.js
sed -n '1,80p' packages/common/src/utils/storage.manager.js

# Show internal supabase block fully (around internal S3 client creation)
sed -n '260,360p' packages/common/src/utils/storage.manager.js

# Show the input validation for s3Endpoint (external supabase via s3 gateway config)
sed -n '360,480p' packages/common/src/utils/input.validation.js

# Show the UI hints/placeholder for s3Endpoint (to infer expected format)
sed -n '880,1100p' apps/web-dashboard/src/pages/ProjectSettings.jsx

Repository: geturbackend/urBackend

Length of output: 24488


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the BYOS storage config patch route
rg -n "byod-config|resources\\.storage|storageProvider\"|storageUrl|storageKey" apps packages -S

# Inspect the storage routes for where config is created/normalized
fd -e js -e ts routes apps | head -n 50 || true
fd byod-config -t f . || true
fd storage -t f apps packages/common/src/utils || true

# Specifically open the project storage config routes (best-effort by filename)
rg -n "byod-config" apps -S

Repository: geturbackend/urBackend

Length of output: 9884


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show how external BYOS storage config is persisted (storageUrl/storageKey -> resources.storage.config)
sed -n '470,610p' apps/dashboard-api/src/controllers/project.controller.js

# Also show the updateExternalConfig handler call site (if it does normalization)
sed -n '600,710p' apps/dashboard-api/src/controllers/project.controller.js

# Locate the validator usage for byos storage config
rg -n "updateExternalConfig|storageUrl|storageKey|storageProvider|s3Endpoint" apps/dashboard-api/src/controllers/project.controller.js apps/dashboard-api/src/middlewares apps/dashboard-api/src/routes -S

# Find env examples / docs for SUPABASE_URL and internal storage
ls -la
fd -a -t f ".env*example" . || true
fd -a -t f "README*|*.md" . | head -n 50
rg -n "SUPABASE_URL|SUPABASE_PUBLIC_KEY|SUPABASE_SECRET_KEY" -S . || true

Repository: geturbackend/urBackend

Length of output: 10869


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect env example for SUPABASE_URL / internal storage expectations
rg -n "SUPABASE_URL|SUPABASE_PUBLIC_KEY|SUPABASE_SECRET_KEY" .env.example
sed -n '1,120p' .env.example

# Search for any mention of /storage/v1/s3 in repo
rg -n "/storage/v1/s3|storage/v1/s3|storage\\.supabase\\.co|\\.storage\\.supabase\\.co" -S .

# Inspect storage.manager.js earlier sections that parse resources.storage.config for upload/presign/head object
sed -n '80,220p' packages/common/src/utils/storage.manager.js
sed -n '220,320p' packages/common/src/utils/storage.manager.js
sed -n '320,430p' packages/common/src/utils/storage.manager.js

# Inspect input.validation storageUrl/s3Endpoint expectations more directly
sed -n '360,470p' packages/common/src/utils/input.validation.js

# Inspect storage controllers tests that might construct endpoints
rg -n "storageUrl|s3Endpoint|SUPABASE_URL|storage/v1/s3" apps -S

Repository: geturbackend/urBackend

Length of output: 2407


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Where getS3CompatibleStorage is used
rg -n "getS3CompatibleStorage\\(" packages/common/src/utils/storage.manager.js packages/common/src -S

# Show internal Supabase S3 client creation block with line context
sed -n '260,320p' packages/common/src/utils/storage.manager.js

# Show verifyUploadedFile / head object usage context (where returned s3Client is used)
sed -n '320,420p' packages/common/src/utils/storage.manager.js

Repository: geturbackend/urBackend

Length of output: 3521


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '100,260p' packages/common/src/utils/storage.manager.js
sed -n '260,380p' packages/common/src/utils/storage.manager.js

Repository: geturbackend/urBackend

Length of output: 9051


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "getS3CompatibleStorage\\(" -S apps packages

Repository: geturbackend/urBackend

Length of output: 278


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1,160p' apps/consumer/src/workers/export.worker.js

# Also search for any other S3 usage in that worker
rg -n "HeadObjectCommand|ListObjects|GetObject|verifyUploadedFile|getS3CompatibleStorage|SUPABASE_URL|storage/v1/s3" apps/consumer/src/workers/export.worker.js -S

Repository: geturbackend/urBackend

Length of output: 4114


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "storage\\.supabase\\.co|storage/v1/s3|/storage/v1/s3|supabase\\.co/storage/v1" -S . || true

Repository: geturbackend/urBackend

Length of output: 615


Fix internal Supabase S3 endpoint for AWS SDK.
getS3CompatibleStorage() sets endpoint: process.env.SUPABASE_URL (e.g., https://<project>.supabase.co), but Supabase’s S3-compatible endpoint must be https://<project-ref>.storage.supabase.co/storage/v1/s3 (including the storage.* host and /storage/v1/s3 path). With the current value, the AWS SDK calls will hit the wrong URL, breaking internal S3 operations like the export worker upload + signed download URL. [packages/common/src/utils/storage.manager.js:288-296]
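
One way to derive that endpoint by parsing `SUPABASE_URL` is sketched below. The helper name `deriveSupabaseS3Endpoint` is hypothetical, and the sketch assumes a hosted project whose URL has the form `https://<project-ref>.supabase.co`; self-hosted or custom-domain setups would need an explicit endpoint setting instead.

```javascript
// Hypothetical helper (not in the PR): map https://<project-ref>.supabase.co
// to https://<project-ref>.storage.supabase.co/storage/v1/s3.
function deriveSupabaseS3Endpoint(supabaseUrl) {
  const { hostname } = new URL(supabaseUrl);
  // Only a single-label project ref directly under supabase.co is accepted.
  const match = hostname.match(/^([a-z0-9-]+)\.supabase\.co$/i);
  if (!match) {
    throw new Error(
      "Cannot derive project ref from Supabase URL host: " + hostname
    );
  }
  return `https://${match[1]}.storage.supabase.co/storage/v1/s3`;
}

// Possible use when building the client:
//   endpoint: deriveSupabaseS3Endpoint(process.env.SUPABASE_URL),
```

Throwing on an unrecognized host is a deliberate choice in this sketch: a misconfigured URL fails at startup rather than producing requests against the wrong endpoint.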

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/common/src/utils/storage.manager.js` around lines 288 - 296,
getS3CompatibleStorage() is setting endpoint to process.env.SUPABASE_URL (e.g.
https://<project>.supabase.co) which is wrong for Supabase’s S3 API; change the
endpoint to the Supabase Storage S3 endpoint
(https://<project-ref>.storage.supabase.co/storage/v1/s3) instead. Update the
logic in getS3CompatibleStorage() to derive or use the correct storage host
(either build from SUPABASE_PROJECT_REF or parse SUPABASE_URL) and ensure the
endpoint includes the /storage/v1/s3 path while keeping region, forcePathStyle
and credentials usage intact so AWS SDK S3 calls hit the internal storage API.

@renganathc renganathc force-pushed the feature/db-export-pipeline branch from 0e71cd0 to 414fe0a Compare May 27, 2026 02:01
@renganathc
Contributor Author

Hello @yash-pouranik. Sorry for the delay, I was travelling. I've made the changes and the PassThrough works as intended. However, there are a few things I wanted to discuss regarding the Supabase S3 gateway I worked on. Currently, the entire project uses the default Supabase SDK for uploads. So I couldn't use the getProject() fn to return a client, and instead created an S3-based one that returns an S3 client.

The problem with this is that the user has to register with the S3 config (URL, access key and secret key) instead of the usual config while using BYOS. This means we would have to update our entire infrastructure to use S3.

(This is related to the CodeRabbit flag as well. It is pointing out the ambiguity with process.env.SUPABASE_URL.)

My suggestion (for a quick fix) would be to collect both the S3 config and the Supabase config while integrating BYOS, so that we have both and can use either the S3 gateway or the default Supabase SDK for uploads. In the future we could completely migrate to S3-based uploads to avoid any inconsistencies.

What do you think ?
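
For illustration, the dual-config idea could look roughly like the sketch below. The config shape (`s3` and `supabase` keys and their fields) and the function name `pickUploadPath` are assumptions made for this example, not the project's actual schema.

```javascript
// Hypothetical BYOS config shape (an assumption for this sketch):
//   { supabase: { url, key }, s3: { endpoint, accessKeyId, secretAccessKey } }
// Returns which upload path to use for a project.
function pickUploadPath(storageConfig) {
  const s3 = storageConfig && storageConfig.s3;
  if (s3 && s3.endpoint && s3.accessKeyId && s3.secretAccessKey) {
    return "s3-stream"; // stream directly through the S3 gateway
  }
  if (storageConfig && storageConfig.supabase) {
    return "supabase-sdk"; // fall back to the default Supabase SDK upload
  }
  throw new Error("No usable storage configuration found");
}
```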


Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Database Backup & Export Pipeline (Async Workers & Streams)

3 participants